Chats and Groups
To proceed to a full recovery and exploration of the chat history by means of analyzing the msgstore
file, we firstly need to understand how to distinguish between group chats and one-to-one conversations.
Key tables and relations
chat
: this table contains information about the various conversations you had, both groups and one-to-one chats (even hidden ones). In case of a group, it will also contain the group name (or subject), while it won't directly include the phone number in case of a single conversation, but it will help us to get that information.- important columns for this query:
_id
,jid_row_id
,subject
,hidden
- important columns for this query:
jid
: this table contains a record for each entity that can act in a WhatsApp system and associates an identifying string called jid to each entity.- By entity I mean both users and groups, but users are further specified by their device. In fact, if you change your device (or, I believe, just uninstall and re-install WhatsApp at some point) other people's WhatsApp systems will see you as a different entity (maybe because your end-to-end encryption key changes).
- the
user
field stores phone numbers. - important columns for this query:
_id
,user
,type
,raw_string
Additional notes
- the column of
jid
calledraw_string
is a combination of its other columns and is the identifying string for the entity referenced in that record. - each
raw_string
is composed as<user>.<agent>:<device>@<server>
- if
device
is0
, then it's just<user>@<server>
user
is the phone number in case of a 1-to-1 chat, and it's composed as<phone number of the group creator>-<timestamp of group creation>
in the case of a group- I don't know what
agent
is, but it seems to always have value0
except for those records that haveserver
equal tolid
(I don't know what these are either, but I'm confident enough that these are neither users or groups). server
is equal tog.us
for groups and tos.whatsapp.net
for single users, but other values do exist that I still don't know how to use:broadcast
,newsletter
,lid
,lid_me
,temp
,status_me
.device
identifies the current device (I believe it's better to talk about the current app installation) the specific user is running on.
- if
type
injid
can have many different values (they don't seem to be a contiguous range, as there are some values I never encountered), from0
to21
.- The only thing I came up with is that
0
means regular user,1
means group.
- The only thing I came up with is that
- the
chat
table only referencesjid
records that havedevice
equal to0
. - you will note that Channels are treated as normal Groups in the local database, so we don't have to discuss them further: you would just find each Topic of the Channel memorized as a different group in the various tables.
Query
CREATE VIEW AUX_conversation AS
SELECT
chat._id AS id,
chat.hidden AS is_hidden,
jid._id AS recipient_jid_row_id,
jid.raw_string AS recipient_jid,
COALESCE(chat.subject, jid.user) AS displayed_name,
jid.type AS is_group
FROM
chat JOIN jid ON chat.jid_row_id = jid._id
WHERE jid.type = 0 OR jid.type = 1; -- discarding values of unknown meaning
We can now proceed to recover messages!